We investigate the moment estimation for an ergodic diffusion process with unknown trend
coefficient. We consider nonparametric and parametric estimation. In each case, we present a
lower bound for the risk and then construct an asymptotically efficient estimator of the moment
type functional or of a parameter which has a one-to-one correspondence to such a functional.
Next, we clarify a higher order property of the moment type estimator by the Edgeworth
expansion of the distribution function.
1. Introduction

Suppose that a diffusion process $\{X_t, 0 \le t \le T\}$ is uniquely defined by the stochastic
differential equation

$$dX_t = S(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad 0 \le t \le T, \tag{1}$$

where $\{W_t, t \ge 0\}$ is a standard Wiener process, and $X_0$ is an initial random variable
independent of $\{W_t, t \ge 0\}$. We assume that $\{X_t, 0 \le t \le T\}$ is ergodic with the invariant
probability distribution $\mu_S$ (depending on $S$ and $\sigma$).

In our statistical problem, the diffusion coefficient $\sigma$ is known to the observer, while
the trend coefficient $S$ is unknown. Given a function $F: \mathbb{R} \to \mathbb{R}$, we want to estimate the
parameter

$$\vartheta = \vartheta_S = \mathrm{E}_S[F(\xi)] = \int F(y)\,\mu_S(dy),$$

where $\mathrm{E}_S$ denotes the expectation with respect to $\mu_S$, and $\xi$ denotes a "stationary"
random variable with distribution $\mu_S$. If $X_t$ is stationary, we take $\xi = X_0$; if not, we may
enlarge the original probability space to realize $\xi$.
We are interested in the asymptotically efficient estimation of $\vartheta$ based on the
observations $\{X_t, 0 \le t \le T\}$ as $T \to \infty$. We will consider nonparametric and parametric
estimation problems.

Section 2 treats a nonparametric case where the function $S$ is completely unknown.
As usual in such problems, we derive a minimax bound of the risks of all estimators and
then show that the empirical moment estimator

$$\vartheta_T^* = \frac{1}{T}\int_0^T F(X_t)\,dt$$

is asymptotically efficient in that $\vartheta_T^*$ attains this lower bound.
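To make the empirical moment estimator concrete, the following minimal sketch simulates a path and evaluates the time average. It rests on assumptions that are not part of the paper: the drift $S(x) = -x$ with $\sigma = 1$ (an Ornstein-Uhlenbeck model, for which $\mathrm{E}[\xi^2] = 1/2$), an Euler-Maruyama discretization, and the hypothetical helper names `simulate_ou` and `empirical_moment`.

```python
import math
import random


def simulate_ou(T, dt, theta=1.0, x0=0.0, seed=0):
    """Euler-Maruyama path of dX_t = -theta * X_t dt + dW_t (illustrative model)."""
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    path = [x0]
    x = x0
    for _ in range(n_steps):
        x += -theta * x * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def empirical_moment(path, F):
    """Left-point Riemann sum for (1/T) * int_0^T F(X_t) dt on a uniform grid."""
    values = [F(x) for x in path[:-1]]
    return sum(values) / len(values)


path = simulate_ou(T=500.0, dt=0.01, seed=1)
estimate = empirical_moment(path, lambda y: y * y)
print(estimate)  # should be close to E[xi^2] = 1/2 for this model
```

The path here starts from $X_0 = 0$ rather than from the invariant law, so the estimate carries a small bias that vanishes as $T$ grows.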

In Section 3, we suppose that the function $S$ belongs to a parametric family, that
is, $S = S(\gamma, \cdot)$, $\gamma \in \Gamma$, and $\vartheta$ is a function of $\gamma$: $\vartheta = \vartheta(\gamma)$. Then it turns out that the
maximum likelihood estimator (MLE) $\hat\gamma_T$ (under certain regularity conditions) provides
the asymptotically efficient estimator $\hat\vartheta_T = \vartheta(\hat\gamma_T)$ of $\vartheta$. However, the computation of the
MLE in nonlinear models is often not easy. To avoid this drawback, we consider the
so-called one-step MLE introduced by LeCam [6], which allows us to improve the empirical
moment estimator to an asymptotically efficient one in this parametric setting. Here the
asymptotic efficiency is in the sense of Hajek and LeCam.

After investigations of the first-order asymptotics, a natural direction of study is the 
second-order approximation of the distribution of the estimator. In Section 4, we derive 
an asymptotic expansion formula with the help of the local approach of the Malliavin 
calculus developed recently by Yoshida [12, 13, 14] and Kusuoka and Yoshida [2]. 



2. Nonparametric estimation 

We consider the diffusion process $\{X_t, 0 \le t \le T\}$ described in Section 1.

In this section, $S$ is unknown, while $\sigma$ is a known continuous positive function. Let us
denote by $\mathcal{S}$ the class of functions $S$ that satisfy

$$V(S,x) = \int_0^x \exp\left\{-2\int_0^y \frac{S(v)}{\sigma(v)^2}\,dv\right\} dy \to \pm\infty \quad \text{as } x \to \pm\infty \tag{2}$$

and

$$G(S) = \int_{-\infty}^{\infty} \sigma(y)^{-2}\exp\left\{2\int_0^y \frac{S(v)}{\sigma(v)^2}\,dv\right\} dy < \infty. \tag{3}$$

These conditions guarantee the existence of the invariant probability measure $\mu_S$ with the
density function

$$f_S(x) = G(S)^{-1}\sigma(x)^{-2}\exp\left\{2\int_0^x \frac{S(v)}{\sigma(v)^2}\,dv\right\} \tag{4}$$

and the law of large numbers. We suppose that the initial value $X_0 = \xi$ has the probability
density function $f_S(\cdot)$; in particular, the process $X_t$, $t \ge 0$, is stationary.
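The normalizing constant (3) and the density (4) can be evaluated numerically for a given drift. The sketch below is illustrative only: it truncates the real line to a finite grid and uses the trapezoidal rule, and the function name `invariant_density` and the test drift $S(x) = -x$, $\sigma = 1$ (for which $G(S) = \sqrt{\pi}$) are assumptions introduced here.

```python
import math


def invariant_density(S, sigma, lo=-8.0, hi=8.0, n=1600):
    """Approximate G(S) from (3) and f_S from (4) on a uniform grid over [lo, hi]."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    ratio = [S(x) / sigma(x) ** 2 for x in xs]
    i0 = min(range(n + 1), key=lambda i: abs(xs[i]))  # grid point closest to 0
    # A[i] approximates int_0^{x_i} S(v) / sigma(v)^2 dv (cumulative trapezoid)
    A = [0.0] * (n + 1)
    for i in range(i0 + 1, n + 1):
        A[i] = A[i - 1] + 0.5 * h * (ratio[i - 1] + ratio[i])
    for i in range(i0 - 1, -1, -1):
        A[i] = A[i + 1] - 0.5 * h * (ratio[i] + ratio[i + 1])
    weights = [math.exp(2.0 * A[i]) / sigma(xs[i]) ** 2 for i in range(n + 1)]
    G = h * (sum(weights) - 0.5 * (weights[0] + weights[-1]))
    density = [w / G for w in weights]
    return xs, density, G


xs, density, G = invariant_density(lambda x: -x, lambda x: 1.0)
print(G, math.sqrt(math.pi))  # the two numbers should nearly coincide
```

The truncation interval must be wide enough that the neglected tails of (3) are negligible for the drift at hand.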
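We would like to estimate the mathematical expectation

$$\vartheta_S = \mathrm{E}_S[F(\xi)] = \int_{-\infty}^{\infty} F(y) f_S(y)\,dy.$$

To construct a lower minimax bound on the risks of all estimators, we define a
nonparametric vicinity of a fixed model as follows. Fix some $S_*(\cdot) \in \mathcal{S}$ and $\delta > 0$, and introduce
the set

$$V_\delta = \left\{ S(\cdot) : \sup_{x \in \mathbb{R}} |S(x) - S_*(x)| < \delta \right\}.$$

We suppose that $\delta$ is such that the conditions (2) and (3) are fulfilled for all $S(\cdot) \in V_\delta$
and

$$\sup_{S(\cdot) \in V_\delta} G(S) < \infty, \qquad \sup_{S(\cdot) \in V_\delta} \mathrm{E}_S|F(\xi)| < \infty. \tag{5}$$

The role of Fisher information in our problem will be played by the quantity

$$I(S) = \left\{ 4\,\mathrm{E}_S\left[ \left( \frac{M_S(\xi)}{\sigma(\xi) f_S(\xi)} \right)^2 \right] \right\}^{-1}, \tag{6}$$

where $M_S(y) = \mathrm{E}_S[(F(\xi) - \vartheta_S)\chi_{\{\xi < y\}}]$. We put $I_* = I(S_*)$. We choose the polynomial
loss function $\ell(u) = |u|^p$, $p > 0$.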
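As a sanity check on (6), the quantity $I(S)$ can be computed by quadrature when the invariant density is known. The sketch below is a hedged illustration, not part of the paper: it assumes the Ornstein-Uhlenbeck model $S(x) = -x$, $\sigma = 1$ with density $f_S(y) = e^{-y^2}/\sqrt{\pi}$ and $F(y) = y$, where a direct calculation gives $M_S(y) = -f_S(y)/2$ and hence $I(S) = 1$. The name `fisher_information` and the grid parameters are hypothetical.

```python
import math


def fisher_information(F, f, sigma, lo=-6.0, hi=6.0, n=1200):
    """Evaluate I(S) from (6) by the trapezoidal rule on a truncated grid."""
    h = (hi - lo) / n
    ys = [lo + i * h for i in range(n + 1)]
    fs = [f(y) for y in ys]

    def trapz(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    theta = trapz([F(y) * fy for y, fy in zip(ys, fs)])
    centered = [(F(y) - theta) * fy for y, fy in zip(ys, fs)]
    # M[i] approximates M_S(y_i) = int_{-inf}^{y_i} (F(z) - theta) f(z) dz
    M = [0.0] * (n + 1)
    for i in range(1, n + 1):
        M[i] = M[i - 1] + 0.5 * h * (centered[i - 1] + centered[i])
    integrand = [
        (M[i] / (sigma(ys[i]) * fs[i])) ** 2 * fs[i] for i in range(n + 1)
    ]
    return 1.0 / (4.0 * trapz(integrand))


info = fisher_information(
    F=lambda y: y,
    f=lambda y: math.exp(-y * y) / math.sqrt(math.pi),
    sigma=lambda y: 1.0,
)
print(info)  # analytic value for this model is 1
```

The reciprocal $I(S)^{-1}$ is then the asymptotic variance of $T^{1/2}(\vartheta_T^* - \vartheta_S)$ appearing in the lower bound.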



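Theorem 1. Let the conditions (5) be fulfilled and let $I_* > 0$. Then

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\inf_{\bar\vartheta_T}\,\sup_{S(\cdot) \in V_\delta} \mathrm{E}_S[\ell(T^{1/2}(\bar\vartheta_T - \vartheta_S))] \ge \mathrm{E}[\ell(\eta I_*^{-1/2})], \tag{7}$$

where $\mathcal{L}(\eta) = N(0,1)$ and inf is taken over all possible estimators $\bar\vartheta_T$ of the unknown
parameter.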

Proof. We follow the well known scheme described in Ibragimov and Khasminskii ([1],
Chapter 4); see also the proofs of similar results in Kutoyants [4].

Denote $\vartheta_* = \vartheta_{S_*}$ and let $\psi(\cdot)$ be a continuous function with compact support such that
$S_h(\cdot) = S_*(\cdot) + (h - \vartheta_*)\psi(\cdot)\sigma(\cdot)^2 \in V_\delta$ for all $h \in (\vartheta_* - \gamma, \vartheta_* + \gamma)$. Here $\gamma > 0$ is a number
chosen in such a way that $S_h(\cdot) \in V_\delta$. For the process (1) with $S(\cdot) = S_h(\cdot)$ we consider
the parameter estimation problem for $h$ and recall the construction of the Hajek-LeCam
minimax bound in this situation.

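The direct expansion of the function $\vartheta_h = \vartheta_{S_* + (h - \vartheta_*)\psi\sigma^2}$ by the powers of $h - \vartheta_*$
gives the representation

$$\vartheta_h = \vartheta_* + 2(h - \vartheta_*)\left\{ \mathrm{E}_*\left[ F(\xi)\int_0^\xi \psi(v)\,dv \right] - \vartheta_*\,\mathrm{E}_*\left[ \int_0^\xi \psi(v)\,dv \right] \right\} + o(h - \vartheta_*),$$

where we write $\mathrm{E}_h = \mathrm{E}_{S_h}$ and $\mathrm{E}_* = \mathrm{E}_{S_*}$. Therefore, if we take $\psi(\cdot)$ from the class

$$\mathcal{K} = \left\{ \psi(\cdot) : \mathrm{E}_*\left[ [F(\xi) - \vartheta_*]\int_0^\xi \psi(v)\,dv \right] = 2^{-1} \right\},$$

then $\vartheta_h = h + o(h - \vartheta_*)$. The corresponding family of measures $\{P_h^{(T)}, h \in (\vartheta_* - \gamma, \vartheta_* + \gamma)\}$
is locally asymptotically normal (LAN) at the point $h = \vartheta_*$; therefore, we can apply the
Hajek-LeCam inequality (see Ibragimov and Khasminskii [1], Theorem 2.12.1) and write
for $\psi \in \mathcal{K}$,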

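$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\inf_{\bar\vartheta_T}\,\sup_{S(\cdot) \in V_\delta} \mathrm{E}_S[\ell(T^{1/2}(\bar\vartheta_T - \vartheta_S))]
\ge \lim_{\gamma \to 0}\,\lim_{T \to \infty}\,\inf_{\bar\vartheta_T}\,\sup_{|h - \vartheta_*| < \gamma} \mathrm{E}_h[\ell(T^{1/2}(\bar\vartheta_T - \vartheta_h))] \ge \mathrm{E}[\ell(\eta I_\psi^{-1/2})].$$

Here $\eta \sim N(0,1)$ and $I_\psi = \mathrm{E}_*(\psi(\xi)\sigma(\xi))^2$ is the Fisher information in the problem of
estimation of $h$.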

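Furthermore, using the integration-by-parts formula and the Cauchy-Schwarz inequality
we can write

$$\frac{1}{2} = \mathrm{E}_*\left[ (F(\xi) - \vartheta_*)\int_0^\xi \psi(v)\,dv \right]
= -\int_{-\infty}^{\infty} \left[ \int_{-\infty}^{y} (F(z) - \vartheta_*) f_{S_*}(z)\,dz \right] \psi(y)\,dy$$

$$= -\mathrm{E}_*\left[ \psi(\xi)\sigma(\xi)\,\frac{M_{S_*}(\xi)}{\sigma(\xi) f_{S_*}(\xi)} \right]
\le I_\psi^{1/2} \left( \mathrm{E}_*\left[ \left( \frac{M_{S_*}(\xi)}{\sigma(\xi) f_{S_*}(\xi)} \right)^2 \right] \right)^{1/2}.$$

Therefore,

$$I_\psi \ge \left\{ 4\,\mathrm{E}_*\left[ \left( \frac{M_{S_*}(\xi)}{\sigma(\xi) f_{S_*}(\xi)} \right)^2 \right] \right\}^{-1} = I_*.$$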

We will have the equality if we choose

$$\psi_*(y) = -\frac{C}{\sigma(y)^2 f_{S_*}(y)} \int_{-\infty}^{y} (F(z) - \vartheta_*) f_{S_*}(z)\,dz$$

with the corresponding normalizing constant $C > 0$, but this function does not have
compact support and, therefore, it cannot belong to $\mathcal{K}$. As usual in such situations (see
Ibragimov and Khasminskii [1], page 218), we introduce a sequence of smooth functions
$\{\psi_N(\cdot)\}$ with compact supports approximating $\psi_*(\cdot)$ and such that

$$\inf_{\psi(\cdot) \in \mathcal{K}} I_\psi = \lim_{N \to \infty} I_{\psi_N} = I_*.$$

Then

$$\sup_{\psi(\cdot) \in \mathcal{K}} \mathrm{E}\,\ell(\eta I_\psi^{-1/2}) = \mathrm{E}\,\ell(\eta I_*^{-1/2})$$

and we have the desired estimate (7). □



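Definition 1. We say that the estimator $\bar\vartheta_T$ is asymptotically efficient for the loss
function $\ell(\cdot)$ if

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{S(\cdot) \in V_\delta} \mathrm{E}_S\,\ell(T^{1/2}(\bar\vartheta_T - \vartheta_S)) = \mathrm{E}\,\ell(\eta I_*^{-1/2}) \tag{8}$$

for all functions $S_*(\cdot) \in \mathcal{S}$.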



Below we will show that the empirical estimator $\vartheta_T^*$ is an asymptotically efficient estimator
of $\vartheta$ in this sense.

The process $\{X_t\}_{t \in \mathbb{R}_+}$ is stationary; therefore the estimator is unbiased: $\mathrm{E}_S\vartheta_T^* = \vartheta_S$.
If the initial value $X_0$ has another distribution, then the estimator $\vartheta_T^*$ is no longer
unbiased but nevertheless we have $|\mathrm{E}_S\vartheta_T^* - \vartheta_S| \le C/T$ (see Kutoyants [3], Lemma 3.4.8)
and the properties of the estimator $\vartheta_T^*$ can be studied without the assumption of stationarity
as well.

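Let us introduce the function

$$H_S(y) = \int_0^y \frac{2}{\sigma(x)^2 f_S(x)} \int_{-\infty}^{x} [F(v) - \vartheta_S] f_S(v)\,dv\,dx$$

and the conditions that, for some $p_* > 0$,

$$\sup_{S(\cdot) \in V_\delta} \mathrm{E}_S|H_S(\xi)|^{p_*} < \infty, \qquad \sup_{S(\cdot) \in V_\delta} \mathrm{E}_S\left| \frac{M_S(\xi)}{\sigma(\xi) f_S(\xi)} \right|^{p_*} < \infty. \tag{9}$$

Let

$$Q_S(y) = \frac{2M_S(y)}{\sigma(y) f_S(y)}.$$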



Theorem 2. Suppose that conditions (5) and (9) for some $p_* > 2$ are fulfilled, that the
Fisher information $I(S)$ is continuous at $S(\cdot) = S_*(\cdot)$ and that the law of large numbers
holds uniformly in $S(\cdot) \in V_\delta$, that is, for each $\kappa > 0$,

$$\lim_{T \to \infty}\,\sup_{S(\cdot) \in V_\delta} P_S^{(T)}\left\{ \left| \frac{1}{T}\int_0^T Q_S(X_t)^2\,dt - \mathrm{E}_S Q_S(\xi)^2 \right| > \kappa \right\} = 0. \tag{10}$$

Then the empirical moment $\vartheta_T^*$ is an asymptotically efficient estimator of the parameter
$\vartheta_S$ under the loss function $\ell(u) = |u|^p$ with $p < p_*$.
Proof. Using the Itô formula, we rewrite the difference $\eta_T = T^{1/2}(\vartheta_T^* - \vartheta_S)$ with a
stochastic integral as

$$\eta_T = \frac{H_S(X_T) - H_S(X_0)}{\sqrt{T}} - \frac{1}{\sqrt{T}}\int_0^T Q_S(X_t)\,dW_t \tag{11}$$

under $P_S$. The uniform version of the central limit theorem for stochastic integrals
(Kutoyants [3], Theorem 3.3.7) allows us to show the weak convergence

$$\frac{1}{\sqrt{T}}\int_0^T Q_S(X_t)\,dW_t \Longrightarrow N(0, \mathrm{E}_S Q_S(\xi)^2)$$

uniform in $S(\cdot) \in V_\delta$. Therefore, the empirical estimator is uniformly asymptotically
normal

$$\mathcal{L}_S\{T^{1/2}(\vartheta_T^* - \vartheta_S)\} \Longrightarrow N(0, I(S)^{-1}). \tag{12}$$

For the polynomial loss functions we have to verify the uniform integrability of the family
of random variables $\{|\eta_T|^p, T \to \infty\}$, but it follows from the representation (11) and the
conditions (9) as it was done in Kutoyants [5]. □

Example 1. Let us consider the case where $\sigma(y) = 1$ and, for some $\varrho > 0$, $A > 0$ and
$L > 0$, define the class of functions

$$\mathcal{S}(\varrho, A, L) = \{ S(\cdot) : |S(y) - S(z)| \le L|y - z| \text{ for } |y|, |z| \le A, \quad \mathrm{sgn}(y)S(y) \le -\varrho \text{ for } |y| > A \}.$$

Then, for the function $F(y) = |y|^k$ with any $k > 0$, we can verify all the conditions of
Theorem 2 to show that the estimator

$$\vartheta_T^* = \frac{1}{T}\int_0^T |X_t|^k\,dt$$

is consistent, uniformly (in $S(\cdot) \in V_\delta$ with $\delta < \varrho$) asymptotically normal and asymptotically
efficient for the polynomial loss functions $\ell(u) = |u|^p$ with any $p > 0$. The verification
is quite close to the one given in Kutoyants [5].

Remark 1. If we put $F(y) = \chi_{\{y < x\}}$, then $\vartheta = D(x) = P\{\xi < x\}$ is the value of the
invariant distribution function at point $x$. Theorems 1 and 2 yield the asymptotic efficiency
of the empirical distribution function

$$\hat D_T(x) = \frac{1}{T}\int_0^T \chi_{\{X_t < x\}}\,dt; \tag{13}$$

see Kutoyants [4] for details.
3.1. Maximum likelihood estimator

Below we consider the problem of estimation of the drift function of the observed diffusion
process described by

$$dX_t = S(\gamma, X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0 = x, \quad 0 \le t \le T. \tag{14}$$

The trend coefficient $S(\cdot)$ is supposed to be known up to an unknown parameter $\gamma \in \Gamma = (\alpha, \beta)$.
We suppose that the function $\sigma(\cdot)$ is known and positive, that the conditions (2)
and (3) are fulfilled for all $S(\cdot) = S(\gamma, \cdot)$, $\gamma \in \Gamma$, and that the equation (14) has a unique
weak solution. Therefore, the process $X_t$, $t \ge 0$, has an ergodic property and the invariant
density function is

$$f(\gamma, y) = G(\gamma)^{-1}\sigma(y)^{-2}\exp\left\{ 2\int_0^y \frac{S(\gamma, v)}{\sigma(v)^2}\,dv \right\}$$

with $G(\gamma) = G(S(\gamma, \cdot))$.

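The parameter $\vartheta$ is now a function of $\gamma$:

$$\vartheta(\gamma) = \int F(y) f(\gamma, y)\,dy, \qquad \gamma \in \Gamma.$$

Set $\Theta = \{\vartheta : \vartheta = \vartheta(\gamma), \gamma \in \Gamma\}$ and denote

$$\dot\vartheta(\gamma) = 2\,\mathrm{E}_\gamma\left[ F(\xi)\int_\zeta^\xi \frac{\dot S(\gamma, v)}{\sigma(v)^2}\,dv \right]$$

the derivative of $\vartheta(\gamma)$ with respect to $\gamma$. Here $\xi$ and $\zeta$ are two independent random
variables with the same density function $f(\gamma, \cdot)$ and the dot means derivative with respect
to $\gamma$.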

The regularity condition will be the following: 

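Condition C1. The function $S(\gamma, x)$, $x \in \mathbb{R}$, $\gamma \in \Gamma$, has two continuous bounded
derivatives on $\gamma$, the function $\sigma(v)^2 \ge \kappa_1 > 0$ for some $\kappa_1$, the Fisher information

$$I(\gamma) = \mathrm{E}_\gamma\left[ \left( \frac{\dot S(\gamma, \xi)}{\sigma(\xi)} \right)^2 \right]$$

is uniformly positive, $\inf_{\gamma \in \Gamma} I(\gamma) > 0$, and the derivative of the function $\vartheta(\gamma)$ is separated
from zero,

$$0 < \inf_{\gamma \in \Gamma} |\dot\vartheta(\gamma)| \le \sup_{\gamma \in \Gamma} |\dot\vartheta(\gamma)| < \infty. \tag{15}$$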
Under this regularity condition, the family of measures $\{P_\gamma^{(T)}, \gamma \in \Gamma\}$ is LAN at any
point $\gamma_0 \in \Gamma$ and we have the Hajek-LeCam lower bound on the risks of all estimators
for the loss functions $\ell(u) = |u|^p$, $p \ge 1$, that is,

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\inf_{\bar\gamma_T}\,\sup_{|\gamma - \gamma_0| < \delta} \mathrm{E}_\gamma[\ell(T^{1/2}(\bar\gamma_T - \gamma))] \ge \mathrm{E}[\ell(\eta I(\gamma_0)^{-1/2})], \tag{16}$$

where $\mathcal{L}(\eta) = N(0,1)$ (see Kutoyants [3], Theorems 3.3.8 and 3.3.4).

Therefore, the asymptotically efficient estimator in this parametric estimation problem 
will be defined as follows: 

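Definition 2. We say that an estimator $\bar\gamma_T$ is asymptotically efficient for the loss
function $\ell(\cdot)$ if the equality

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{|\gamma - \gamma_0| < \delta} \mathrm{E}_\gamma\,\ell(T^{1/2}(\bar\gamma_T - \gamma)) = \mathrm{E}\,\ell(\eta I(\gamma_0)^{-1/2}) \tag{17}$$

holds for all $\gamma_0 \in \Gamma$.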

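By condition (15), the above optimality is equivalent to the optimality of an estimator
for $\vartheta(\gamma)$: let us put $I_\vartheta = \dot\vartheta(\gamma_0)^{-2} I(\gamma_0)$.

Definition 3. The estimator $\bar\vartheta_T$ is asymptotically efficient for the loss function $\ell(\cdot)$ if

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{|\gamma - \gamma_0| < \delta} \mathrm{E}_\gamma\,\ell(T^{1/2}(\bar\vartheta_T - \vartheta(\gamma))) = \mathrm{E}\,\ell(\eta I_\vartheta^{-1/2}). \tag{18}$$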

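The condition $\dot\vartheta(\gamma) \ne 0$ yields a one-to-one mapping $\Theta \leftrightarrow \Gamma$ and the maximum likelihood
estimator of the parameter $\vartheta$ is $\hat\vartheta_T = \vartheta(\hat\gamma_T)$, where $\hat\gamma_T$ is the MLE of the parameter $\gamma$
defined by the equation

$$L(\hat\gamma_T, \gamma_1, X^T) = \sup_{\gamma \in \Gamma} L(\gamma, \gamma_1, X^T). \tag{19}$$

Here $\gamma_1$ is some fixed value and the conditional likelihood ratio is given by the formula
(see Liptser and Shiryayev [7])

$$L(\gamma, \gamma_1, X^T) = \frac{dP_\gamma^{(T)}}{dP_{\gamma_1}^{(T)}}(X^T)
= \exp\left\{ \int_0^T \frac{S(\gamma, X_t) - S(\gamma_1, X_t)}{\sigma(X_t)^2}\,dX_t
- \frac{1}{2}\int_0^T [S(\gamma, X_t)^2 - S(\gamma_1, X_t)^2]\,\sigma(X_t)^{-2}\,dt \right\}. \tag{20}$$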
It is known that the MLE $\hat\gamma_T$ is uniformly consistent, asymptotically normal

$$\mathcal{L}_\gamma\{T^{1/2}(\hat\gamma_T - \gamma)\} \Longrightarrow N(0, I(\gamma)^{-1})$$

and asymptotically efficient for the polynomial loss function (see Kutoyants [5]).
These properties of $\hat\gamma_T$ immediately give the consistency, asymptotic normality
and asymptotic efficiency of the MLE $\hat\vartheta_T = \vartheta(\hat\gamma_T)$.

Therefore, this approach gives us the asymptotically efficient estimator of the moment,
but it has the following disadvantages. The calculation of the MLE according to
its definition (19) and (20) requires the calculation of stochastic integrals, which are in
general not continuous with respect to the uniform topology, as well as maximization of
a certain nonlinear functional of observations.
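For a linear drift the maximization in (19) has a closed form, which makes the role of the stochastic integral visible. The following minimal sketch assumes the hypothetical model $S(\gamma, x) = -\gamma x$, $\sigma = 1$ (not taken from the paper), for which the logarithm of (20) is quadratic in $\gamma$ and is maximized at $\hat\gamma_T = -\int_0^T X_t\,dX_t / \int_0^T X_t^2\,dt$. The integrals are replaced by Itô-type sums over an Euler-Maruyama path, and the names `simulate_ou` and `mle_linear_drift` are illustrative.

```python
import math
import random


def simulate_ou(gamma, T, dt, x0=0.0, seed=0):
    """Euler-Maruyama path of dX_t = -gamma * X_t dt + dW_t (illustrative model)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    path = [x0]
    x = x0
    for _ in range(int(round(T / dt))):
        x += -gamma * x * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def mle_linear_drift(path, dt):
    """Maximizer of the discretized likelihood ratio for S(gamma, x) = -gamma * x."""
    # Ito sum: the integrand is evaluated at the left end of each increment
    ito_sum = sum(a * (b - a) for a, b in zip(path[:-1], path[1:]))
    energy = sum(a * a for a in path[:-1]) * dt  # approximates int X_t^2 dt
    return -ito_sum / energy


path = simulate_ou(gamma=1.0, T=500.0, dt=0.01, seed=2)
gamma_hat = mle_linear_drift(path, dt=0.01)
print(gamma_hat)  # should be close to the true value 1.0
```

For nonlinear families no such closed form is available in general, and the maximization has to be carried out numerically on the same stochastic sums.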



3.2. One-step estimator based on the empirical estimator 

In this subsection, we propose another estimator, which is much easier to calculate and
nevertheless is asymptotically efficient in the sense of Definition 2 for the polynomial loss
functions.

Note that the empirical estimator

$$\vartheta_T^* = \frac{1}{T}\int_0^T F(X_t)\,dt$$

given in Section 1 is consistent (with probability 1) and is asymptotically normal

$$\mathcal{L}_\gamma\{T^{1/2}(\vartheta_T^* - \vartheta(\gamma))\} \Longrightarrow N(0, I(S(\gamma, \cdot))^{-1}),$$

where $I(S)$ is defined in (6). Below we will improve this estimator to an asymptotically
efficient one. We suppose as well that the equation $\vartheta(\gamma) = \vartheta$ can be solved with respect
to $\gamma$ and we have the function $\gamma = \gamma(\vartheta)$ too. This equation can be solved preliminarily,
say, numerically because it does not depend on observations and by the condition (15)
the solution always exists.

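For each locally integrable function $a : \mathbb{R} \to \mathbb{R}$, define

$$G_a(x, \gamma) = -\int_0^x G(S(\gamma, \cdot))\,p_{S(\gamma,\cdot)}(y)\left( \int_y^{\infty} 2a(v) f_{S(\gamma,\cdot)}(v)\,dv \right) dy,$$

where

$$p_{S(\gamma,\cdot)}(y) = \exp\left( -2\int_0^y \sigma(v)^{-2} S(\gamma, v)\,dv \right),$$

if $\int_{-\infty}^{\infty} |a(v)| f_{S(\gamma,\cdot)}(v)\,dv < \infty$ (cf. Kutoyants [5] or Yoshida [13] for the estimate of the
Green function). Moreover, define a family of functions

$$\mathcal{C}_\gamma = \left\{ a \in C_\uparrow(\mathbb{R}) : \int a(x) f_{S(\gamma,\cdot)}(x)\,dx = 0, \ [a], G_a(\cdot, \gamma) \in C_\uparrow(\mathbb{R}) \right\},$$

where

$$[a] = -\sigma\,\partial G_{a - \langle a, f_{S(\gamma,\cdot)}\rangle}(\cdot, \gamma), \qquad \langle a, f_{S(\gamma,\cdot)}\rangle = \int a(x) f_{S(\gamma,\cdot)}(x)\,dx,$$

and $C_\uparrow(\mathbb{R})$ is the space of continuous functions of at most polynomial growth.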
Put 

$$\Delta_T(\gamma, X^T) = \frac{1}{\sqrt{T}}\int_0^T \left( \dot S(\gamma, X_t)\sigma'(X_t)\sigma(X_t)
- \dot S(\gamma, X_t) S(\gamma, X_t) - \tfrac{1}{2}\dot S'(\gamma, X_t)\sigma(X_t)^2 \right) \sigma(X_t)^{-2}\,dt \tag{21}$$

and define the estimator

$$\tilde\gamma_T = \gamma_T^* + \frac{\Delta_T(\gamma_T^*, X^T)}{I(\gamma_T^*)\sqrt{T}},$$

where $\gamma_T^* = \gamma(\vartheta_T^*)$. This is the so-called one-step MLE introduced for LAN families by
LeCam [6]. It improves a consistent estimator to an efficient one. Note that the one-step
estimator

$$\tilde\vartheta_T = \vartheta_T^* + \frac{\dot\vartheta(\gamma_T^*)\,\Delta_T(\gamma_T^*, X^T)}{I(\gamma_T^*)\sqrt{T}}$$

coincides with $\vartheta(\tilde\gamma_T)$ up to first order. Thus, first we will consider the former estimator
in the sequel. However, $\tilde\gamma_T$ is constructed based on our empirical estimator and it is
different from the one-step estimator treated in Kutoyants [5].
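The one-step construction can be illustrated on a toy family. The sketch below assumes, purely for illustration, $S(\gamma, x) = -\gamma x$, $\sigma = 1$ and $F(y) = y^4$; none of these choices comes from the paper. Under these assumptions $\vartheta(\gamma) = 3/(4\gamma^2)$, so $\gamma(\vartheta) = \sqrt{3/(4\vartheta)}$, the Fisher information is $I(\gamma) = 1/(2\gamma)$, and (21) reduces to $\Delta_T(\gamma, X^T) = T^{-1/2}\int_0^T (\tfrac12 - \gamma X_t^2)\,dt$, which gives $\tilde\gamma_T = \gamma_T^*\,(2 - 2\gamma_T^* m_2)$ with $m_2 = T^{-1}\int_0^T X_t^2\,dt$. The function names are hypothetical.

```python
import math
import random


def simulate_ou(gamma, T, dt, x0=0.0, seed=0):
    """Euler-Maruyama path of dX_t = -gamma * X_t dt + dW_t (illustrative model)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    path = [x0]
    x = x0
    for _ in range(int(round(T / dt))):
        x += -gamma * x * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def one_step_gamma(theta_star, m2):
    """One-step estimator for S(gamma, x) = -gamma * x, sigma = 1, F(y) = y**4.

    theta_star: empirical fourth moment (1/T) int X_t^4 dt
    m2:         empirical second moment (1/T) int X_t^2 dt
    """
    gamma_star = math.sqrt(3.0 / (4.0 * theta_star))  # preliminary estimator gamma(theta*)
    delta_over_sqrt_T = 0.5 - gamma_star * m2          # Delta_T(gamma*, X^T) / sqrt(T)
    fisher = 1.0 / (2.0 * gamma_star)                  # I(gamma*) for this family
    return gamma_star + delta_over_sqrt_T / fisher


path = simulate_ou(gamma=1.0, T=500.0, dt=0.01, seed=3)
values = path[:-1]
theta_star = sum(x ** 4 for x in values) / len(values)
m2 = sum(x ** 2 for x in values) / len(values)
gamma_tilde = one_step_gamma(theta_star, m2)
print(gamma_tilde)  # should be close to the true value 1.0
```

Only time averages of the path enter the correction, so no stochastic integral has to be evaluated, in contrast with the MLE.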
We will use the following assumptions: 

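Condition C2. (i) There exists a constant $C$ such that for $H(x, \gamma) = \partial_\gamma^i \partial_x^j S(\gamma, x)$, $0 \le i \le 1$,
$0 \le j \le 3$ and $\sigma(x)$,

$$\sup_\gamma |H(x, \gamma)| \le C(1 + |x|^C)$$

for all $x \in \mathbb{R}$.

(ii) $S(\gamma, \cdot) \in \mathcal{S}(\varrho, A, L)$ for some constants $\varrho$, $A$, and $L$.

(iii) For $a_\gamma := \dot S(\gamma, \cdot)^2\sigma(\cdot)^{-2} - I(\gamma)$, $a_\gamma \in \mathcal{C}_\gamma$ for every $\gamma$ and

$$\sup_\gamma \mathrm{E}_\gamma[|G_{a_\gamma}(\xi, \gamma)|] + \sup_\gamma \mathrm{E}_\gamma[[a_\gamma]^2(\xi)] < \infty.$$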
7 7 
Theorem 3. Suppose that C1 and C2 are fulfilled. Then the estimator $\tilde\gamma_T$ based on the
initial empirical estimator is uniformly consistent, asymptotically normal

$$\mathcal{L}_\gamma\{T^{1/2}(\tilde\gamma_T - \gamma)\} \Longrightarrow N(0, I(\gamma)^{-1})$$

and asymptotically efficient for the polynomial loss function. In particular, the one-step
estimator $\tilde\vartheta_T$ based on the empirical initial estimator has the same asymptotic properties
with asymptotic normality:

$$\mathcal{L}_\gamma\{\sqrt{T}(\tilde\vartheta_T - \vartheta(\gamma))\} \Longrightarrow N(0, I_\vartheta^{-1}).$$

Remark 2. The result may seem to be a corollary to a known general result. However, 
$L^p$ integrability of the estimator comes from a particular property of the initial
estimator such as the empirical estimator we adopted here. Moreover, even if they have the
same first-order asymptotic property, one-step estimators are different from each other, 
depending on the choice of the initial estimator. It will be understood more clearly if we 
consider the second-order asymptotics. 



Proof of Theorem 3. Denote the true value of $\gamma$ by $\gamma_0$ and use $\gamma$ as a variable. Define
a random field

$$\tilde\Delta_T(\gamma, X^T) = \frac{1}{\sqrt{T}}\int_0^T \frac{\dot S(\gamma, X_t)}{\sigma(X_t)^2}\,[dX_t - S(\gamma, X_t)\,dt].$$



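Then, under $P_{\gamma_0}^{(T)}$,

$$\tilde\Delta_T(\gamma, X^T) = \frac{1}{\sqrt{T}}\int_0^T \frac{\dot S(\gamma, X_t)}{\sigma(X_t)}\,dW_t
- \frac{1}{\sqrt{T}}\int_0^T \frac{\dot S(\gamma, X_t)}{\sigma(X_t)^2}\,[S(\gamma, X_t) - S(\gamma_0, X_t)]\,dt$$

$$= \frac{1}{\sqrt{T}}\int_{X_0}^{X_T} \frac{\dot S(\gamma, y)}{\sigma(y)^2}\,dy
- \frac{1}{\sqrt{T}}\int_0^T \frac{\dot S(\gamma, X_t)}{\sigma(X_t)^2}\,S(\gamma_0, X_t)\,dt
- \frac{1}{2\sqrt{T}}\int_0^T \partial_x\left[ \frac{\dot S(\gamma, x)}{\sigma(x)^2} \right]_{x = X_t} \sigma(X_t)^2\,dt
- \frac{1}{\sqrt{T}}\int_0^T \frac{\dot S(\gamma, X_t)}{\sigma(X_t)^2}\,[S(\gamma, X_t) - S(\gamma_0, X_t)]\,dt$$

$$= \frac{1}{\sqrt{T}}\int_{X_0}^{X_T} \frac{\dot S(\gamma, y)}{\sigma(y)^2}\,dy
- \frac{1}{\sqrt{T}}\int_0^T \left\{ \frac{\sigma(X_t)^2}{2}\,\partial_x\left[ \frac{\dot S(\gamma, x)}{\sigma(x)^2} \right]_{x = X_t}
+ \frac{\dot S(\gamma, X_t) S(\gamma, X_t)}{\sigma(X_t)^2} \right\} dt,$$

and since

$$-\frac{\sigma(x)^2}{2}\,\partial_x\left[ \frac{\dot S(\gamma, x)}{\sigma(x)^2} \right] - \frac{\dot S(\gamma, x) S(\gamma, x)}{\sigma(x)^2}
= \frac{1}{\sigma(x)^2}\left( \dot S(\gamma, x)\sigma'(x)\sigma(x) - \tfrac{1}{2}\dot S'(\gamma, x)\sigma(x)^2 - \dot S(\gamma, x) S(\gamma, x) \right),$$

we have

$$\tilde\Delta_T(\gamma, X^T) = \frac{1}{\sqrt{T}}\int_{X_0}^{X_T} \frac{\dot S(\gamma, y)}{\sigma(y)^2}\,dy + \Delta_T(\gamma, X^T). \tag{22}$$

The right-hand side of (22) does not involve Itô stochastic integrals, so it provides a
smooth version of the random field $\tilde\Delta_T(\gamma, X^T)$.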
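Put

$$p(x, \gamma) = \int_0^x \frac{\dot S(\gamma, y)}{\sigma(y)^2}\,dy.$$

It is easily seen that under $P_{\gamma_0}^{(T)}$,

$$\sqrt{T}(\tilde\gamma_T - \gamma_0) = \sqrt{T}(\gamma_T^* - \gamma_0) + \frac{\Delta_T(\gamma_T^*, X^T)}{I(\gamma_T^*)}
= \frac{1}{I(\gamma_T^*)\sqrt{T}}\int_0^T \frac{\dot S(\gamma_0, X_t)}{\sigma(X_t)}\,dW_t + R_T(\gamma_0),$$

where $R_T(\gamma_0) = R_{1,T}(\gamma_0) + R_{2,T}(\gamma_0)$ with

$$R_{1,T}(\gamma_0) = -\frac{1}{I(\gamma_T^*)\sqrt{T}}\,[p(X_T, \gamma_0) - p(X_0, \gamma_0)]$$

and

$$R_{2,T}(\gamma_0) = \sqrt{T}(\gamma_T^* - \gamma_0) + \frac{1}{I(\gamma_T^*)}\,[\Delta_T(\gamma_T^*, X^T) - \Delta_T(\gamma_0, X^T)].$$

It follows from Condition C2 that for $T \ge 1$,

$$\sup_{\gamma_0} \mathrm{E}_{\gamma_0}[|R_{1,T}(\gamma_0)|] \le \frac{C}{\sqrt{T}}.$$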
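Next, we will estimate the second residual $R_{2,T}(\gamma_0)$. Denote by $L_\gamma$ the generator of
the diffusion process $X_t$ that corresponds to the parameter $\gamma$,

$$L_\gamma = S(\gamma, x)\frac{\partial}{\partial x} + \frac{1}{2}\sigma(x)^2\frac{\partial^2}{\partial x^2},$$

and put $M(\gamma, x) = \partial_\gamma(L_\gamma p)(x, \gamma)$. Then obviously $M(\gamma, x) = L_\gamma \dot p(x, \gamma) + \dot S(\gamma, x)^2\sigma(x)^{-2}$.
Since $\Delta_T(\gamma, X^T) = -T^{-1/2}\int_0^T (L_\gamma p)(X_t, \gamma)\,dt$, the mean value theorem gives, for some
$\gamma_T^{**}$ between $\gamma_0$ and $\gamma_T^*$,

$$R_{2,T}(\gamma_0) = \frac{1}{I(\gamma_T^*)}\,R_{3,T}(\gamma_0)\,\sqrt{T}(\gamma_T^* - \gamma_0),$$

where

$$R_{3,T}(\gamma_0) = -\frac{1}{T}\int_0^T M(\gamma_T^{**}, X_t)\,dt + I(\gamma_T^*).$$

Then $R_{3,T}(\gamma_0) = \sum_{i=4}^{8} R_{i,T}(\gamma_0)$ with

$$R_{4,T}(\gamma_0) = I(\gamma_T^*) - I(\gamma_0),$$

$$R_{5,T}(\gamma_0) = -\frac{1}{T}\int_0^T \dot S(\gamma_0, X_t)^2\sigma(X_t)^{-2}\,dt + I(\gamma_0),$$

$$R_{6,T}(\gamma_0) = -\frac{1}{T}\int_0^T [\dot S(\gamma_T^{**}, X_t)^2 - \dot S(\gamma_0, X_t)^2]\,\sigma(X_t)^{-2}\,dt,$$

$$R_{7,T}(\gamma_0) = -\frac{1}{T}\int_0^T (L_{\gamma_0}\dot p)(\gamma_0, X_t)\,dt,$$

and

$$R_{8,T}(\gamma_0) = -\frac{1}{T}\int_0^T [(L_{\gamma_T^{**}}\dot p)(\gamma_T^{**}, X_t) - (L_{\gamma_0}\dot p)(\gamma_0, X_t)]\,dt.$$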

By using the uniform non-degeneracy of $\dot\vartheta(\gamma)$ (Condition C1) and the uniform estimate
for $\vartheta_T^*$, we obtain, for every $p > 1$,

$$\sup_{\gamma_0} \mathrm{E}_{\gamma_0}[|\sqrt{T}(\gamma_T^* - \gamma_0)|^p] \le \left( \sup_\gamma |\dot\vartheta(\gamma)^{-1}| \right)^p \sup_{\gamma_0} \mathrm{E}_{\gamma_0}[|\sqrt{T}(\vartheta_T^* - \vartheta_0)|^p] \le C_p < \infty$$

for all $T \in \mathbb{R}_+$. Because $\sup_\gamma |\dot I(\gamma)| < \infty$ as a consequence of Condition C2, we obtain for
$i = 4$,

$$\sup_{\gamma_0} \mathrm{E}_{\gamma_0}[|R_{i,T}(\gamma_0)|] \le \frac{C}{\sqrt{T}}. \tag{23}$$

In a similar fashion, we obtain the estimate (23) for $i = 6, 8$. Also, $R_{5,T}$ and $R_{7,T}$ can be
estimated by using Itô's formula, the Burkholder-Davis-Gundy inequality and Condition
C2. Thus, $R_T(\gamma_0)$ is estimated uniformly in $\gamma_0$. Finally, the uniform central limit theorem
for the principal term of $\sqrt{T}(\tilde\gamma_T - \gamma_0)$ implies the desired result.

The assertion for $\tilde\vartheta_T$ is an easy consequence. □

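Remark 3. If we are interested in estimating the distribution function [see Remark 1
with $\vartheta = D(\gamma, x)$] and if $\hat D_T(x)$ is its empirical distribution function (13), then according
to Theorem 3, the estimator

$$\tilde D_T(x) = \hat D_T(x) + \frac{\dot D(\gamma_T^*, x)\,\Delta_T(\gamma_T^*, X^T)}{I(\gamma_T^*)\sqrt{T}}$$

will be asymptotically efficient in the following sense:

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{|\gamma - \gamma_0| < \delta} \mathrm{E}_\gamma\,\ell(T^{1/2}(\tilde D_T(x) - D(\gamma, x))) = \mathrm{E}\,\ell(\eta I_D^{-1/2}).$$

Here $\gamma_T^*$ is the solution of the equation $D(\gamma_T^*, x) = \hat D_T(x)$, and $I(\gamma_0)$ and $I_D$ are the
corresponding Fisher informations (see Kutoyants [5]).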



4. Asymptotic expansion 

We are still considering the diffusion process $X_t$ that satisfies the stochastic differential
equation (1) and the nonparametric estimator for the expectation $\vartheta = \mathrm{E}[F(\xi)]$. As shown
in the previous sections,

$$\vartheta_T^* = \frac{1}{T}\int_0^T F(X_t)\,dt$$

is asymptotically normal and asymptotically efficient in a nonparametric sense. In this
section, we shall investigate the higher order distribution of our nonparametric estimator;
more precisely, the asymptotic expansion of its distribution will be presented.

The asymptotic expansion of the distribution of ergodic diffusions recently was ob- 
tained by Yoshida [12, 13, 14] applying the Malliavin calculus. Among two possible 
methodologies, that is, the global approach and the local approach, here we shall take the 
newly developed local approach formalized by Kusuoka and Yoshida [2] for continuous- 
time processes and applied by Sakamoto and Yoshida [8] . The support theorems serve to 
verify the non-degeneracy (see Yoshida [14]). 

Let $C_\uparrow^\infty(\mathbb{R})$ be the space of smooth functions on $\mathbb{R}$, all derivatives of which are of at
most polynomial growth, and let $C_b^\infty(\mathbb{R})$ be the space of bounded smooth functions on $\mathbb{R}$
with bounded derivatives. We denote by $B\mathcal{G}$ the set of bounded $\mathcal{G}$-measurable functions.
We assume that $F, S \in C_\uparrow^\infty(\mathbb{R})$, $S'', \sigma \in C_b^\infty(\mathbb{R})$, and $\sigma(x) > 0$ for any $x \in \mathbb{R}$, and that $X_t$
is stationary. We may construct a solution $X_t$ over a (partial) Wiener space $(\Omega, P)$, that
is, $\Omega = \mathbb{R} \times W$, where $W = \{w : \mathbb{R}_+ \to \mathbb{R}, \text{continuous}, w(0) = 0\}$, and $P = \nu \otimes P^W$, $P^W$ being
a Wiener measure on $W$ and $\nu$ being the stationary distribution of the diffusion process.
Let

$$\mathcal{B}_I^X = \sigma[X_t : t \in I] \vee \mathcal{N}$$

for $I \subset \mathbb{R}_+$, $\mathcal{N}$ being the $\sigma$-field generated by null sets. We later use the following
conditions:

Condition A1. There exists a positive constant $a$ such that

$$\left\| \mathrm{E}[h \mid \mathcal{B}^X_{[0,s]}] - \mathrm{E}[h] \right\|_1 \le a^{-1} e^{-a(t - s)} \|h\|_\infty$$

for any $s, t \in \mathbb{R}_+$, $s \le t$, and for any $h \in B\mathcal{B}^X_{[t,\infty)}$.

Condition A2. Process $X \in \bigcap_{p > 1} L^p(P)$.

A sufficient condition for A1 is provided, for example, in Veretennikov [10, 11] and
Kusuoka and Yoshida [2]. Indeed, if $\sigma \in C_b^\infty(\mathbb{R})$ and if there exists a function $\rho \in C^\infty(\mathbb{R})$
such that $\rho > 0$, $\int_{\mathbb{R}} \rho(x)\,dx = 1$ and $\limsup_{|x| \to \infty} \rho(x)^{-1} L^*\rho(x) < 0$, where $L^*$ is the
formal adjoint operator of the generator $L$ of this diffusion process, then Condition A1
holds true (see Kusuoka and Yoshida [2]).

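Put

$$Z_T = \int_0^T q(X_t)\,dt,$$

where $q(x) = F(x) - \vartheta$. As in Kusuoka and Yoshida [2], we define the $r$th cumulant
function of $Z_T/\sqrt{T}$ by

$$\chi_{T,r}(u) = \left( \frac{d}{d\varepsilon} \right)^r \bigg|_{\varepsilon = 0} \log \mathrm{E}\left[ \exp\left( i\varepsilon u \cdot \frac{Z_T}{\sqrt{T}} \right) \right].$$

Then define the function

$$\hat\Psi_{T,k}(u) = \exp\left( \tfrac{1}{2}\chi_{T,2}(u) \right) + \sum_{r=1}^{k} T^{-r/2}\,\tilde P_{T,r}(u),$$

where $\tilde P_{T,r}(u)$ are defined by the formal Taylor expansion

$$\exp\left( \sum_{r=2}^{\infty} \frac{1}{r!}\,\varepsilon^{r-2}\chi_{T,r}(u) \right) = \exp\left( \tfrac{1}{2}\chi_{T,2}(u) \right) + \sum_{r=1}^{\infty} \varepsilon^r\,T^{-r/2}\,\tilde P_{T,r}(u).$$

Denote by $\Psi_{T,k}$ the signed measure defined as the Fourier inversion of $\hat\Psi_{T,k}$. For a
measurable function $h : \mathbb{R} \to \mathbb{R}$, let

$$\omega(h, r) = \int_{\mathbb{R}} \sup\{ |h(x + y) - h(x)| : |y| \le r \}\,\phi(x; 0, \hat I^{-1})\,dx,$$

where $\phi(x; \mu, \Sigma)$ denotes the probability density of the normal distribution $N(\mu, \Sigma)$ and
$\hat I$ is an arbitrary positive number smaller than $I(S)$. The Hermite polynomials are defined
by

$$h_k(z; \Sigma) = (-1)^k \phi(z; 0, \Sigma)^{-1}\,\partial_z^k \phi(z; 0, \Sigma)$$

for a positive constant $\Sigma$.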

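For a continuous function $a : \mathbb{R} \to \mathbb{R}$, we define $G_a : \mathbb{R} \to \mathbb{R}$ by

$$G_a(x) = \int_{-\infty}^{x} G(S)\,p(y)\left( \int_{-\infty}^{y} 2a(v) f(v)\,dv \right) dy,$$

where $p(y) = \exp(-2\int_0^y \sigma(w)^{-2} S(w)\,dw)$, if the mapping

$$y \mapsto p(y)\left( \int_{-\infty}^{y} a(v) f(v)\,dv \right)$$

is in $L^1((-\infty, 0], dy)$. We write $\langle a, f\rangle = \int_{\mathbb{R}} a(x) f(x)\,dx$ and

$$[a] = -\sigma\,\partial G_{a - \langle a, f\rangle}.$$

Define the set of functions

$$\mathcal{C} = \left\{ a \in C_\uparrow(\mathbb{R}) \ \Big|\ \int_{\mathbb{R}} a(x) f(x)\,dx = 0; \ p(\cdot)\int_{-\infty}^{\cdot} a(x) f(x)\,dx \in L^1((-\infty, 0]); \ [a], G_a \in C_\uparrow(\mathbb{R}) \right\}.$$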

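Theorem 4. Let $k \in \mathbb{N}$ and let $M, \gamma, K > 0$. Suppose that $F : \mathbb{R} \to \mathbb{R}$ is not constant.
Then

(1) There exist constants $\delta > 0$ and $c > 0$ such that for $h \in \mathcal{E}(M, \gamma)$,

$$|\mathrm{E}[h(\sqrt{T}(\vartheta_T^* - \vartheta))] - \Psi_{T,k}[h]| \le c\,\omega(h, T^{-K}) + \epsilon_T^{(k)},$$

where $\epsilon_T^{(k)} = o(T^{-(k + \delta)/2})$. Here $\mathcal{E}(M, \gamma) = \{h : \mathbb{R} \to \mathbb{R}, \text{measurable}, |h(x)| \le M(1 + |x|^\gamma)\ (x \in \mathbb{R})\}$.

(2) The signed measure $d\Psi_{T,1}$ has a density $d\Psi_{T,1}(z)/dz = p_{T,1}(z)$ with

$$p_{T,1}(z) = \phi(z; 0, \kappa_T^{(2)})\left( 1 + \tfrac{1}{6}\kappa_T^{(3)} h_3(z; \kappa_T^{(2)}) \right),$$

where $\kappa_T^{(r)}$ is the $r$th cumulant of $\sqrt{T}(\vartheta_T^* - \vartheta)$. Moreover, if $q$ and $[q]^2 - \langle [q]^2, f\rangle$
belong to $\mathcal{C}$, then

$$p_{T,1}(z) = p_{T,1}^*(z) + R_T(z),$$

where

$$p_{T,1}^*(z) = \phi(z; 0, I(S)^{-1})\left( 1 + \frac{1}{2\sqrt{T}}\,\mathrm{E}\big[[q]\,[[q]^2](\xi)\big]\,h_3(z; I(S)^{-1}) \right)$$

and

$$\lim_{T \to \infty} \sqrt{T}\,\sup_{z \in \mathbb{R}} \{ |R_T(z)| \exp(bz^2) \} = 0$$

for some positive constant $b$. In particular,

$$\left| \mathrm{E}[h(\sqrt{T}(\vartheta_T^* - \vartheta))] - \int_{\mathbb{R}} h(z)\,p_{T,1}^*(z)\,dz \right| \le c\,\omega(h, T^{-K}) + \tilde\epsilon_T$$

for any $h \in \mathcal{E}(M, \gamma)$, with $\tilde\epsilon_T = o(1/\sqrt{T})$.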
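The first-order Edgeworth density in part (2) is straightforward to evaluate once the second and third cumulants are available. The sketch below is illustrative only: the function names are hypothetical, the cumulant values used in the example are made up, and the integrals are approximated by a plain trapezoidal rule. It uses the explicit form $h_3(z; \Sigma) = z^3/\Sigma^3 - 3z/\Sigma^2$, which follows from the definition of the Hermite polynomials by differentiating the normal density three times.

```python
import math


def phi(z, var):
    """Density of N(0, var) at z."""
    return math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def h3(z, var):
    """Third Hermite polynomial h_3(z; var) = -phi'''(z) / phi(z)."""
    return z ** 3 / var ** 3 - 3.0 * z / var ** 2


def edgeworth_density(z, k2, k3):
    """First-order Edgeworth density phi(z; 0, k2) * (1 + k3 / 6 * h_3(z; k2))."""
    return phi(z, k2) * (1.0 + k3 / 6.0 * h3(z, k2))


def integrate(g, lo=-12.0, hi=12.0, n=4800):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


k2, k3 = 1.0, 0.3  # made-up cumulants, for illustration only
mass = integrate(lambda z: edgeworth_density(z, k2, k3))
third_moment = integrate(lambda z: z ** 3 * edgeworth_density(z, k2, k3))
print(mass, third_moment)  # total mass near 1, third moment near k3
```

The correction term integrates to zero, so the approximation keeps total mass one while reproducing the prescribed third cumulant; it is a signed density and may take negative values in the tails.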

Proof. We consider the stochastic flow $\mathcal{X}(t, x) = (X(t, x), Z(t, x))$ that satisfies the
stochastic differential equation

$$d\mathcal{X}(t, x) = V_0(\mathcal{X}(t, x))\,dt + V(\mathcal{X}(t, x)) \circ dW_t, \qquad \mathcal{X}(0, x) = (x, 0),$$

where $V_0(x, z) = (S(x) - 2^{-1}\sigma'(x)\sigma(x), q(x))$ and $V(x, z) = (\sigma(x), 0)$. By assumption, $F$
is not constant; hence, there exists a point $x_0 \in \mathbb{R}$ such that $F'(x) \ne 0$ in a neighborhood
$U$ of $x_0$. Easy calculus shows that

$$\mathrm{Span}\{V(x_0, 0), [V, V_0](x_0, 0)\} = \mathbb{R}^2.$$

In the same argument used to prove Theorem 4 of Kusuoka and Yoshida [2] or that in
Example 2 of the same paper, Condition [A3'] of it can be verified. Indeed, take the
sequence $u(j) = j$, $v(j) = j + 1$, $j \in \mathbb{N}$, and let $\psi_j = \varphi(X_j)$ for some truncation function
$\varphi \in C^\infty(\mathbb{R}; [0, 1])$ with compact support in $U$ taking value 1 near $x_0$. For each $j \in \mathbb{N}$, the
Malliavin operator $L_j$ is constructed in the usual way so that $L_j$ does not shift the path $w$
outside of $[j, j + 1]$. Then it is known that for sufficiently small $U$, $\{(\det \sigma_{\mathcal{X}(1,x)})^{-1}; x \in U\}$
is bounded in $L^p(P)$ for any $p > 1$. Therefore, Condition [A3'] of Kusuoka and Yoshida
[2] can be verified. Thus, the first assertion has been obtained.

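For the formula of $p_{T,k}$ and its validity, see Sakamoto and Yoshida [8, 9]. To obtain
the last result, some calculations are involved. Given that $q \in \mathcal{C}$,

$$\frac{1}{\sqrt{T}}\int_0^T q(X_t)\,dt = \frac{G_q(X_T) - G_q(X_0)}{\sqrt{T}} + \frac{1}{\sqrt{T}}\int_0^T [q](X_t)\,dW_t \tag{24}$$

and $\mathrm{E}[|G_q(\xi)|^p] < \infty$ for any $p > 1$. Let $0 < \beta < 1/2$. The strong mixing property A1
induces the so-called covariance inequality, which yields

$$\left| \mathrm{E}\left[ G_q(X_T) \cdot \frac{1}{\sqrt{T}}\int_0^{T - T^\beta} [q](X_t)\,dW_t \right] \right|
\le 8\,\alpha\big( \mathcal{B}^X_{[0, T - T^\beta]}, \mathcal{B}^X_{[T, \infty)} \big)^{1/r}
\left\| \frac{1}{\sqrt{T}}\int_0^{T - T^\beta} [q](X_t)\,dW_t \right\|_p \|G_q(X_T)\|_q
\le a'^{-1}\exp(-a' T^\beta),$$

where $\alpha$ is the coefficient of the strong mixing, $a'$ is some positive constant and
$1/p + 1/q + 1/r = 1$. A similar estimate holds even if $G_q(X_T)$ above is replaced with $G_q(X_0)$.
Accordingly, we obtain

$$\kappa_T^{(2)} = \frac{1}{T}\int_0^T \mathrm{E}[[q]^2(X_t)]\,dt
+ \frac{2}{\sqrt{T}}\,\mathrm{E}\left[ (G_q(X_T) - G_q(X_0))\,\frac{1}{\sqrt{T}}\int_0^T [q](X_t)\,dW_t \right]
+ \frac{1}{T}\,\mathrm{E}\{(G_q(X_T) - G_q(X_0))^2\}
= \mathrm{E}[[q]^2(\xi)] + o\left( \frac{1}{\sqrt{T}} \right). \tag{25}$$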



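Next, we consider the third cumulant $\kappa_T^{(3)}$. Put $k = [q]$ and denote simply $k_t = k(X_t)$.
By assumption, $k \in C_\uparrow(\mathbb{R})$, and so we see by Itô's lemma that

$$\mathrm{E}\left[ \left( \frac{1}{\sqrt{T}}\int_0^T k_t\,dW_t \right)^3 \right]
= 3T^{-3/2}\,\mathrm{E}\left[ \int_0^T \left( \int_0^t k_s\,dW_s \right) k_t^2\,dt \right]$$

$$= 3T^{-3/2}\,\mathrm{E}\left[ \int_0^T \left( \int_0^t k_s\,dW_s \right) (k_t^2 - \langle k^2, f\rangle)\,dt \right]$$

$$= 3T^{-1/2}\,\mathrm{E}\left[ \frac{1}{\sqrt{T}}\int_0^T k_s\,dW_s \cdot \frac{1}{\sqrt{T}}\int_0^T (k_t^2 - \langle k^2, f\rangle)\,dt \right].$$

Because $k^2 - \langle k^2, f\rangle \in \mathcal{C}$, the right-hand side equals

$$3T^{-1/2}\,\mathrm{E}\left[ \frac{1}{\sqrt{T}}\int_0^T k_s\,dW_s \cdot \left\{ \frac{G_{k^2 - \langle k^2, f\rangle}(X_T) - G_{k^2 - \langle k^2, f\rangle}(X_0)}{\sqrt{T}}
+ \frac{1}{\sqrt{T}}\int_0^T [[q]^2](X_t)\,dW_t \right\} \right]
= \frac{3}{\sqrt{T}}\,\mathrm{E}\big[[q]\,[[q]^2](\xi)\big] + o\left( \frac{1}{\sqrt{T}} \right). \tag{26}$$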



In view of (24) and by a similar argument after it, it is possible to see that the cross terms
[e.g., those between $G_q(X_T) - G_q(X_0)$ and $(T^{-1/2}\int_0^T [q]\,dW_t)^2$] have asymptotically no
contribution to $\kappa_T^{(3)}$. It follows from (25) and (26) that

$$\sup_{z \in \mathbb{R}} e^{bz^2}\,|p_{T,1}(z) - p_{T,1}^*(z)| = o\left( \frac{1}{\sqrt{T}} \right)$$

for some positive constant $b$. This completes the proof. □